
    Variable Selection for Doubly Robust Causal Inference

    Confounding control is crucial yet challenging for causal inference based on observational studies. Under the typical unconfoundedness assumption, augmented inverse probability weighting (AIPW) has been popular for estimating the average causal effect (ACE) due to its double robustness: the estimator is consistent if either the propensity score model or the outcome mean model is correctly specified. To ensure the key assumption holds, effort is often made to collect a sufficiently rich set of pretreatment variables, rendering variable selection imperative. It is well known that variable selection for the propensity score targeted at accurate prediction may produce a highly variable ACE estimator by including instrumental variables. Thus, many recent works recommend selecting all outcome predictors for both confounding control and efficient estimation. This article shows that the AIPW estimator with variable selection targeted at efficient estimation may lose the desirable double robustness property. Instead, we propose including in the propensity score model any covariate that is a predictor of the treatment, the outcome, or both, which preserves the double robustness of the AIPW estimator. Using this principle, we propose a two-stage procedure: penalized variable selection followed by AIPW estimation. We show that the proposed procedure retains the desirable double robustness property, and we evaluate the finite-sample performance of the AIPW estimator under various variable selection criteria through simulation and an application.
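The standard AIPW estimating equation behind the abstract can be sketched in a few lines. This is a minimal illustration (function and argument names are ours, not the paper's), assuming propensity scores and outcome-mean predictions have already been fitted:

```python
import numpy as np

def aipw_ace(y, a, e_hat, mu1_hat, mu0_hat):
    """AIPW estimate of the average causal effect (ACE).

    y: outcomes; a: binary treatment indicators;
    e_hat: fitted propensity scores P(A=1 | X);
    mu1_hat, mu0_hat: fitted outcome means E[Y | A=1, X], E[Y | A=0, X].
    Doubly robust: consistent if either e_hat or (mu1_hat, mu0_hat)
    comes from a correctly specified model.
    """
    # Augmented IPW term for the treated potential outcome
    term1 = a * y / e_hat - (a - e_hat) / e_hat * mu1_hat
    # Augmented IPW term for the control potential outcome
    term0 = (1 - a) * y / (1 - e_hat) + (a - e_hat) / (1 - e_hat) * mu0_hat
    return np.mean(term1 - term0)
```

With both working models correct, the augmentation terms cancel the weighting noise; misspecifying one of the two still leaves the estimate consistent.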

    Domain-independent Punctuation and Segmentation Insertion

    Punctuation and segmentation insertion is crucial in spoken language translation, as it has a strong impact on translation performance. However, the effect of rare or unknown words on the performance of punctuation and segmentation insertion has not been thoroughly studied. In this work, we simulate various degrees of domain match in the test scenario and investigate their impact on the punctuation insertion task. We explore three schemes for generalizing rare words using part-of-speech (POS) tokens. Experiments show that generalizing rare and unknown words greatly improves punctuation insertion performance, yielding up to 8.8 points of improvement in F-score in the out-of-domain test scenario. We also show that this improvement in punctuation quality benefits downstream machine translation (MT) performance, improving it by 2 BLEU points.
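The core of a POS-based generalization scheme can be sketched simply: words below a training-corpus frequency threshold are replaced by their POS token before the punctuation model sees them. A minimal sketch (the threshold and placeholder format are invented for illustration, not the paper's exact schemes):

```python
def generalize_rare_words(tokens, pos_tags, vocab_counts, min_count=2):
    """Replace rare or unknown words with a POS placeholder token.

    tokens: words of one sentence; pos_tags: their POS tags;
    vocab_counts: word -> frequency in the training corpus.
    """
    return [tok if vocab_counts.get(tok, 0) >= min_count else f"<{pos}>"
            for tok, pos in zip(tokens, pos_tags)]
```

Because out-of-domain text shares POS patterns with in-domain text even when vocabulary differs, the punctuation model degrades less on unseen domains.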

    On optimum sensing time over fading channels for Cognitive Radio system

    Cognitive Radio (CR) is widely expected to be the next Big Bang in wireless communications. In a CR network, secondary users are allowed to utilize the frequency bands of primary users when those bands are not currently in use; for this, a secondary user must be able to detect the presence of the primary user. Spectrum sensing is therefore of significant importance in CR networks. In this thesis, we consider the antenna selection problem over fading channels to optimize the trade-off between detection probability and power efficiency in CR systems. We mathematically formulate a target function consisting of detection probability and power efficiency, and use an energy detection sensing scheme to prove that the formulated problem has a single optimal sensing time that yields the highest target function value. Two techniques are used to model the Rayleigh fading channels: one without correlations and one with correlations in the temporal and frequency domains. For each model, we consider two scenarios for the average SNRs of the channels: in the first, the channels have distinct levels of average SNR; in the second, the channels have similar average SNRs. The antenna selection criterion is based on the received signal strength, and each simulation is compared with a worst-case simulation in which the antennas are selected randomly. Numerical results show that the proposed antenna selection criterion enhances the detection probability and shortens the optimal sensing time. The target function achieves a higher value while maintaining a detection probability of 0.9 compared to the worst-case simulation. The optimal sensing time also varies with other parameters, such as the weighting factor of the target function.
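The trade-off that produces a single optimal sensing time can be illustrated numerically: detection probability improves with longer sensing while power efficiency degrades linearly. The sketch below is our own assumption-laden illustration (the weighted target form, threshold, and Gaussian approximation to energy detection are ours, not the thesis's exact formulation):

```python
import math

def q_func(x):
    """Gaussian Q-function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def target(tau, snr=0.1, fs=1e4, frame=0.1, thresh=1.05, w=0.8):
    """Hypothetical weighted target: w * Pd(tau) + (1 - w) * efficiency.

    Pd uses a Gaussian approximation to energy detection with
    n = tau * fs samples; power efficiency falls as tau / frame grows.
    """
    n = tau * fs  # number of sensing samples
    pd = q_func((thresh - 1 - snr) * math.sqrt(n / (2 * snr + 1)))
    return w * pd + (1 - w) * (1 - tau / frame)

# Grid search over candidate sensing times: the maximizer lies strictly
# inside the frame, illustrating a single interior optimal sensing time.
taus = [i * 1e-3 for i in range(1, 100)]
best_tau = max(taus, key=target)
```

Longer sensing pushes Pd toward 1 with diminishing returns while the efficiency term keeps shrinking, so the weighted sum peaks at an interior sensing time rather than at either extreme.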

    Eccentric Exercise in Treatment of Patellar Tendinopathy in High Level Basketball Players. A Randomized Clinical Trial.

    Chronic patellar tendinopathy is a common pathology in the sporting population. To date, there is no agreed-upon protocol as the treatment of choice. Eccentric exercises have been used with satisfactory outcomes (3). The purpose of this trial was to compare the effects of two eccentric exercise protocols.

    Graph Meets LLM: A Novel Approach to Collaborative Filtering for Robust Conversational Understanding

    Conversational AI systems such as Alexa need to understand defective queries to ensure robust conversational understanding and reduce user friction. These defective queries often arise from user ambiguities, mistakes, or errors in automatic speech recognition (ASR) and natural language understanding (NLU). Personalized query rewriting is an approach that focuses on reducing defects in queries by taking into account the user's individual behavior and preferences. It typically relies on an index of past successful user interactions with the conversational AI. However, interactions unseen in the user's history present additional challenges for personalized query rewriting. This paper presents our "Collaborative Query Rewriting" approach, which specifically addresses the task of rewriting new user interactions that have not been previously observed in the user's history. This approach builds a "User Feedback Interaction Graph" (FIG) of historical user-entity interactions and leverages multi-hop graph traversal to enrich each user's index to cover future unseen defective queries. The enriched user index is called a Collaborative User Index and contains hundreds of additional entries. To counteract the precision degradation caused by the enlarged index, we add transformer layers to the L1 retrieval model and incorporate graph-based and guardrail features into the L2 ranking model. Since the user index can be pre-computed, we further investigate the use of a Large Language Model (LLM) to enhance the FIG for user-entity link prediction in the Video/Music domains; specifically, we investigate the Dolly-V2 7B model. We found that a user index augmented by fine-tuned Dolly-V2 generation significantly enhanced the coverage of future unseen user interactions, thereby boosting query rewriting (QR) performance on unseen queries compared with the graph-traversal-only approach.
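The multi-hop enrichment idea, walking from a user's entities to co-users and back to their entities, can be sketched over a toy bipartite graph. This is only an illustration of the traversal concept (the names, hop policy, and data shapes are ours; the production FIG and retrieval stack are far richer):

```python
from collections import defaultdict

def collaborative_index(interactions, user, hops=2):
    """Expand a user's entity index over a bipartite user-entity graph.

    interactions: iterable of (user, entity) pairs;
    hops: 1 returns only the user's own history, each extra hop adds
    entities reached via entity -> co-user -> entity traversal.
    """
    u2e, e2u = defaultdict(set), defaultdict(set)
    for u, e in interactions:
        u2e[u].add(e)
        e2u[e].add(u)
    entities = set(u2e[user])      # 1-hop: the user's own history
    visited_users = {user}
    for _ in range(hops - 1):
        co_users = {v for e in entities for v in e2u[e]} - visited_users
        entities |= {e for v in co_users for e in u2e[v]}
        visited_users |= co_users
    return entities
```

Because the expansion depends only on historical interactions, the enriched index can be pre-computed offline, which is what makes the subsequent LLM-based link-prediction step affordable.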

    X-Eval: Generalizable Multi-aspect Text Evaluation via Augmented Instruction Tuning with Auxiliary Evaluation Aspects

    Natural Language Generation (NLG) typically involves evaluating the generated text in various aspects (e.g., consistency and naturalness) to obtain a comprehensive assessment. However, multi-aspect evaluation remains challenging, as it may require the evaluator to generalize to any given evaluation aspect, even one absent during training. In this paper, we introduce X-Eval, a two-stage instruction tuning framework for evaluating text in both seen and unseen aspects customized by end users. X-Eval consists of two learning stages: a vanilla instruction tuning stage that improves the model's ability to follow evaluation instructions, and an enhanced instruction tuning stage that exploits the connections between fine-grained evaluation aspects to better assess text quality. To support the training of X-Eval, we collect AspectInstruct, the first instruction tuning dataset tailored for multi-aspect NLG evaluation, spanning 27 diverse evaluation aspects across 65 tasks. To enhance task diversity, we devise an augmentation strategy that converts human rating annotations into diverse forms of NLG evaluation tasks, including scoring, comparison, ranking, and Boolean question answering. Extensive experiments across three essential categories of NLG tasks (dialogue generation, summarization, and data-to-text generation), coupled with 21 aspects in meta-evaluation, demonstrate that X-Eval enables even a lightweight language model to achieve a correlation with human judgments comparable to, if not higher than, state-of-the-art NLG evaluators such as GPT-4.
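The augmentation strategy of recasting one set of human ratings into multiple evaluation task forms can be sketched as follows (the instruction templates and field names are invented for illustration; AspectInstruct's actual templates differ):

```python
def augment(aspect, text_a, score_a, text_b, score_b):
    """Turn two human-rated texts into several NLG-evaluation task forms:
    scoring, pairwise comparison, and Boolean question answering."""
    return [
        {"task": "scoring",
         "instruction": f"Rate the {aspect} of the text from 1 to 5.",
         "input": text_a,
         "output": str(score_a)},
        {"task": "comparison",
         "instruction": f"Which text has better {aspect}, A or B?",
         "input": f"A: {text_a}\nB: {text_b}",
         "output": "A" if score_a >= score_b else "B"},
        {"task": "boolean_qa",
         "instruction": f"Is text A more {aspect} than text B? Yes or No.",
         "input": f"A: {text_a}\nB: {text_b}",
         "output": "Yes" if score_a > score_b else "No"},
    ]
```

Each human annotation thus yields several heterogeneous training instances, which is what lets a single annotated corpus cover many aspect-task combinations.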